test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

julienrbrt · 2026-01-22T10:52:20Z

ref: #2965 (comment)
ref: #2982 (comment)

github-actions · 2026-01-22T10:52:32Z

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`⏩ skipped`	`✅ passed`	`✅ passed`	Jan 23, 2026, 4:03 PM

gemini-code-assist · 2026-01-22T10:52:36Z

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enabling and validating the 'force inclusion' functionality within the system. It removes a previous block in the configuration validation that prevented the feature from being active and subsequently re-enables the comprehensive suite of end-to-end tests designed to verify its correct operation across various EVM components.

Highlights

Force Inclusion Feature Activation: The 'forced inclusion' feature has been activated by removing a temporary error message that previously prevented its use in pkg/config/config.go. The configuration now correctly validates the ForcedInclusionNamespace.
E2E Tests Re-enabled: All end-to-end tests related to EVM force inclusion, including sequencer, full node, and malicious sequencer scenarios, have been re-enabled in test/e2e/evm_force_inclusion_e2e_test.go by removing t.Skip() directives.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

claude · 2026-01-22T10:52:38Z

Claude finished @julienrbrt's task in 2m 36s —— View job

Code Review Complete

I've completed a comprehensive review of PR #3005. Below is my detailed analysis organized by focus area.

Overview

This PR enables the force inclusion feature by:

Removing the validation block that prevented forced inclusion namespace configuration
Re-enabling E2E tests for force inclusion scenarios (2 of 3 tests)
Adding comprehensive benchmarks for FilterTxs with/without force inclusion
Fixing tracing return values to preserve partial results on error

Files Changed:

pkg/config/config.go - Enable forced inclusion namespace validation
test/e2e/evm_force_inclusion_e2e_test.go - Re-enable E2E tests
execution/evm/filter_bench_test.go - NEW comprehensive benchmarks
Multiple go.mod/go.sum files - Dependency bump (gnark-crypto)
Logging cleanup across apps - Remove redundant forced inclusion logs
Tracing fixes - Preserve return values on error

1. Code Quality & Architecture

✅ Strengths

Benchmark Design (execution/evm/filter_bench_test.go:210-511)

Excellent benchmark coverage with 7 distinct scenarios testing different aspects of FilterTxs
Uses Go 1.23's b.Loop() - modern and correct approach
Properly isolates setup from measurement with b.ResetTimer()
Reports allocations with b.ReportAllocs() - essential for performance analysis
Clean helper functions with proper b.Helper() usage

E2E Test Quality (test/e2e/evm_force_inclusion_e2e_test.go)

Well-documented test scenarios with clear flow descriptions
Tests cover sequencer-only and full-node sync scenarios
Proper use of require.Eventually for async operations with reasonable timeouts
Good test isolation with temp directories

Config Validation (pkg/config/config.go:295-299)

Correctly validates forced inclusion namespace when provided
Consistent with other namespace validation patterns
Clear error messages with context

⚠️ Areas for Improvement

Hardcoded Test Credentials (execution/evm/filter_bench_test.go:18-20)

const (
	benchPrivateKey = "cece4f25ac74deb1468965160c7185e07dff413f23fcadb611b05ca37ab0a52e"
	benchToAddress  = "0x944fDcD1c868E3cC566C78023CcB38A32cDA836E"
	benchChainID    = "1234"
)

While this is for benchmarks, consider adding a comment indicating these are test-only values with no real funds. This prevents any confusion about credential exposure.

Missing Documentation (execution/evm/filter_bench_test.go)
The benchmark file lacks package-level documentation explaining:

What FilterTxs does and why these benchmarks matter
The performance implications of force inclusion (UnmarshalBinary overhead)
Expected baseline performance metrics

One Test Still Skipped (test/e2e/evm_force_inclusion_e2e_test.go:402)

func TestEvmSyncerMaliciousSequencerForceInclusionE2E(t *testing.T) {
	t.Skip() // TODO fix before merge

The malicious sequencer detection test is still disabled. This is a critical security test. Consider:

Filing a follow-up issue to track this
Adding a comment explaining what needs to be fixed

2. Error Handling & Resilience

✅ Strengths

Tracing Fixes (block/internal/executing/tracing.go:78-100)
The tracing wrapper now correctly preserves return values when errors occur:

return header, data, err  // ✅ Previously returned nil, nil, err

This is critical for error handling patterns where partial results are meaningful (e.g., ErrNoTransactionsInBatch). The accompanying tests verify this behavior.

Benchmark Error Handling (execution/evm/filter_bench_test.go)
Benchmarks properly use b.Fatalf() for fatal errors during setup and execution.

⚠️ Potential Issues

Silent Invalid Transaction Filtering (execution/evm/execution.go:436-443)
When hasForceIncludedTransaction is true and UnmarshalBinary fails:

if err := ethTx.UnmarshalBinary(tx); err != nil {
	c.logger.Debug().Msg("filtering out invalid transaction (gibberish)")
	result[i] = execution.FilterRemove
	continue
}

Issue: Uses Debug level logging for potentially malicious data on DA layer. Consider:

Using Warn level when source is forced inclusion (DA) vs mempool
Adding metrics to track filtered force-included transactions
Including transaction position/hash for investigation

E2E Test Timeout Variations (test/e2e/evm_force_inclusion_e2e_test.go)
Different timeouts across similar operations:

Line 172: 30 seconds for forced tx inclusion
Line 257: 20 seconds for normal tx sync
Line 272: 40 seconds for forced tx sync

While these account for different complexities, consider documenting why timeouts vary or using named constants for clarity.

3. Security

✅ Strengths

Namespace Validation (pkg/config/config.go:295-299)
Properly validates forced inclusion namespace format using Celestia's namespace validation, preventing malformed configurations.

Test Private Keys
All private keys in tests/benchmarks are clearly for testing purposes with no real funds.

⚠️ Considerations

Force Inclusion Attack Surface
The force inclusion feature allows anyone to submit transactions to DA. Consider documenting:

Rate limiting mechanisms (if any)
Cost implications for DA submission
DoS mitigation strategies
Maximum transaction size limits

Malicious Sequencer Test Disabled (test/e2e/evm_force_inclusion_e2e_test.go:402)
The test verifying sync nodes detect malicious sequencers is skipped. This is a critical security property - sync nodes must detect when sequencers censor forced transactions. Re-enabling this test should be a priority.

Transaction Validation Order
In FilterTxs, size validation happens before unmarshaling when hasForceIncludedTransaction=false:

// Skip tx that can never make it in a block (too big)
if maxBytes > 0 && txBytes > maxBytes {
	result[i] = execution.FilterRemove
	continue
}

This is correct - it prevents resource exhaustion from parsing oversized transactions. ✅

4. Performance & Resource Efficiency

✅ Strengths

Smart Conditional Unmarshaling (execution/evm/execution.go:426-444)

if hasForceIncludedTransaction {
	var ethTx types.Transaction
	if err := ethTx.UnmarshalBinary(tx); err != nil {
		// ...
	}
	txGas = ethTx.Gas()
}

Excellent optimization: Only unmarshals when force-included transactions are present, since mempool transactions are already validated. The benchmarks quantify this overhead.

Benchmark Coverage

Tests both code paths (with/without unmarshal)
Various batch sizes (100, 1000, 10000)
Gas and size limit scenarios
Invalid transaction handling
Direct comparison benchmark (BenchmarkFilterTxs_CompareUnmarshalOverhead)

⚠️ Performance Considerations

Allocation Patterns (execution/evm/filter_bench_test.go:253-292)

func generateSignedTransaction(b *testing.B, nonce uint64, gasLimit uint64) []byte {
	data := make([]byte, 16)
	if _, err := rand.Read(data); err != nil {
		b.Fatalf("failed to generate random data: %v", err)
	}
	// ... creates new transaction each time
}

Benchmark generates fresh transactions on each iteration. This is correct for realistic allocation testing, but consider:

Pre-generating transaction batches for some benchmarks to isolate FilterTxs overhead
Adding a benchmark comment explaining this design choice

Missing Benchmark Baselines
The benchmark file doesn't include expected performance numbers or regression thresholds. Consider adding comments like:

// Expected: ~5000 ns/op for 1000 txs without unmarshal
// Expected: ~15000 ns/op for 1000 txs with unmarshal (3x overhead)

10k Transaction Test (execution/evm/filter_bench_test.go:336)
Testing with 10,000 transactions is excellent for catching O(n²) issues, but ensure CI has sufficient resources for this benchmark.

5. Testing & Reliability

✅ Strengths

Comprehensive Benchmark Scenarios

BenchmarkFilterTxs_OnlyNormalTxs - Baseline (mempool only)
BenchmarkFilterTxs_WithForcedTxs - Force inclusion overhead
BenchmarkFilterTxs_MixedWithInvalidTxs - Error handling path
BenchmarkFilterTxs_WithGasLimit - Gas accounting
BenchmarkFilterTxs_WithSizeLimit - Byte size limits
BenchmarkFilterTxs_CompareUnmarshalOverhead - Direct comparison

Tracing Tests (block/internal/executing/tracing_test.go:113-208)
Excellent addition of tests verifying error cases return values correctly. This catches regressions in error handling patterns.

E2E Test Realism (test/e2e/evm_force_inclusion_e2e_test.go)

Tests actual EVM transactions with proper signing
Uses real DA layer (not mocked)
Tests P2P sync between nodes
Configures realistic epoch parameters (2 DA blocks)

⚠️ Coverage Gaps

Codecov Report Shows 0% Patch Coverage
The codecov bot reports 2 lines missing coverage in pkg/config/config.go. These are the newly enabled validation lines:

if err := validateNamespace(c.DA.GetForcedInclusionNamespace()); err != nil {
	return fmt.Errorf("could not validate forced inclusion namespace (%s): %w", c.DA.GetForcedInclusionNamespace(), err)
}

Consider adding a unit test for config validation with forced inclusion namespace (both valid and invalid cases).

Missing Benchmark: Empty Batch
No benchmark tests the case where hasForceIncludedTransaction=true but batch is empty. This edge case might have unexpected overhead.

Missing E2E: Invalid Forced Transaction
The E2E tests submit valid transactions. Consider testing:

Submitting invalid/malformed transactions to force inclusion namespace
Verifying they're filtered correctly
Checking sequencer remains stable

6. Documentation & Developer Experience

✅ Strengths

Excellent E2E Test Documentation (test/e2e/evm_force_inclusion_e2e_test.go:366-400)
The malicious sequencer test has superb documentation:

Architecture explanation
Test flow with step numbers
Key configuration differences
Expected outcomes
Security implications clearly stated

Function Comments
Helper functions in benchmarks and E2E tests have clear, concise comments explaining purpose.

⚠️ Could Be Better

Missing CHANGELOG Entry
This is a significant feature enablement. Consider adding a CHANGELOG entry:

## [Version]
### Added
- Force inclusion feature now enabled for production use
- Comprehensive benchmarks for FilterTxs performance with force inclusion
- E2E tests for force inclusion scenarios (sequencer and full node)

Benchmark Package Documentation
Add package-level doc explaining FilterTxs performance characteristics:

// Package evm provides benchmarks for EVM execution layer operations.
//
// FilterTxs Benchmarks:
// These benchmarks measure the performance impact of force inclusion transactions,
// which require UnmarshalBinary validation. Normal mempool transactions skip this
// step as they're pre-validated, resulting in ~3x faster filtering.

Config Field Documentation (pkg/config/config.go:189)
The ForcedInclusionNamespace field comment could be more detailed:

ForcedInclusionNamespace string `mapstructure:"forced_inclusion_namespace" yaml:"forced_inclusion_namespace" comment:"Namespace ID for forced inclusion transactions on the DA layer. When set, enables retrieval of censorship-resistant transactions from DA."`

Critical Issues to Address Before Merge

🔴 High Priority

Malicious Sequencer Test Disabled (test/e2e/evm_force_inclusion_e2e_test.go:402)
- This tests a critical security property
- Either fix and enable, or file a tracking issue with detailed blockers
- Add comment explaining what needs to be fixed
Test Coverage for Config Validation (pkg/config/config.go:295-299)
- Add unit tests for forced inclusion namespace validation
- Test both valid and invalid namespace formats
- This will address the codecov 0% patch coverage

🟡 Medium Priority

Logging Level for Invalid Force-Included Transactions
- Consider Warn instead of Debug for DA-sourced invalid transactions
- Add metrics to track filtered force-included transactions
- Document expected vs malicious failure rates
Performance Baselines
- Add expected performance ranges in benchmark comments
- Define regression thresholds for CI
- Document acceptable overhead for force inclusion path
E2E Test for Invalid Forced Transactions
- Test submitting malformed data to force inclusion namespace
- Verify proper filtering and system stability

Recommendations

🎯 Before Merge

✅ Address critical issues above (malicious sequencer test, config test coverage)
Consider adding CHANGELOG entry
Add package-level documentation to benchmark file
File follow-up issues for medium-priority items if not addressing now

🎯 Follow-up PRs

Implement metrics for force inclusion transaction filtering
Add monitoring/alerting for potential DoS via force inclusion
Document force inclusion rate limits and cost model
Consider adding property-based tests for FilterTxs edge cases

Positive Notes 🎉

Excellent benchmark design - Comprehensive coverage of FilterTxs behavior
Smart optimization - Conditional unmarshaling based on force inclusion presence
Thorough E2E testing - Tests real-world scenarios with proper async handling
Good error handling fix - Tracing now preserves partial results correctly
Clean code - Well-structured, readable, follows Go idioms
Security-conscious - Tests include malicious sequencer detection (even if disabled)

Summary

This PR successfully enables a critical feature (force inclusion) with good test coverage and performance optimization. The code quality is high, with idiomatic Go and comprehensive benchmarks.

Main concerns:

One critical security test remains disabled
Missing test coverage for config validation (causing codecov failure)
Some documentation gaps

Recommendation: Address the two critical issues (malicious sequencer test status and config test coverage) before merging. The rest can be follow-up work.

gemini-code-assist

Code Review

This pull request enables the force inclusion feature. The changes in pkg/config/config.go correctly re-enable the validation logic for the forced inclusion namespace, and the changes in test/e2e/evm_force_inclusion_e2e_test.go re-enable the end-to-end tests for this feature by removing t.Skip() calls. The changes are straightforward and effectively achieve the goal of activating the force inclusion functionality and its associated tests. The pull request title also mentions the addition of force inclusion benchmarks, but no new benchmark files were found in the changes; this might be planned for a subsequent pull request.

codecov · 2026-01-22T10:56:30Z

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.87%. Comparing base (9a91394) to head (2f7b304).

Files with missing lines	Patch %	Lines
pkg/config/config.go	0.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3005      +/-   ##
==========================================
- Coverage   57.99%   57.87%   -0.12%     
==========================================
  Files         110      110              
  Lines       10526    10527       +1     
==========================================
- Hits         6105     6093      -12     
- Misses       3770     3784      +14     
+ Partials      651      650       -1

Flag	Coverage Δ
combined	`57.87% <0.00%> (-0.12%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

test: enable force inclusion, re-enable fi e2e and add fi benchmarks

c92c7d8

github-actions bot assigned julienrbrt Jan 22, 2026

gemini-code-assist bot reviewed Jan 22, 2026

View reviewed changes

julienrbrt added 5 commits January 22, 2026 11:56

add filter tx bench

670313a

updates

055f6ee

Merge branch 'main' into julien/test-fi

4ed0a4b

check others

cab4a67

Merge branch 'main' into julien/test-fi

2f7b304

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

julienrbrt commented Jan 22, 2026

Uh oh!

github-actions bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

gemini-code-assist bot commented Jan 22, 2026

Uh oh!

claude bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

codecov bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

Are you sure you want to change the base?

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

Conversation

julienrbrt commented Jan 22, 2026

Uh oh!

github-actions bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Jan 22, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

claude bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Complete

Overview

1. Code Quality & Architecture

✅ Strengths

⚠️ Areas for Improvement

2. Error Handling & Resilience

✅ Strengths

⚠️ Potential Issues

3. Security

✅ Strengths

⚠️ Considerations

4. Performance & Resource Efficiency

✅ Strengths

⚠️ Performance Considerations

5. Testing & Reliability

✅ Strengths

⚠️ Coverage Gaps

6. Documentation & Developer Experience

✅ Strengths

⚠️ Could Be Better

Critical Issues to Address Before Merge

🔴 High Priority

🟡 Medium Priority

Recommendations

🎯 Before Merge

🎯 Follow-up PRs

Positive Notes 🎉

Summary

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

codecov bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions bot commented Jan 22, 2026 •

edited

Loading

claude bot commented Jan 22, 2026 •

edited

Loading

codecov bot commented Jan 22, 2026 •

edited

Loading